Abstract

A model for the phenomenological description of tick-by-tick share prices in a stock 
exchange is introduced. It is based on mixtures of compound Poisson processes. 
Preliminary results based on Monte Carlo simulation show that this model can 
reproduce various stylized facts. 
1 Introduction

Continuous time random walks (CTRWs) were introduced in Physics by Montroll
and Weiss as a model for single-particle (tracer) diffusion [1]. An instance
of CTRW, the normal compound Poisson process, had already been used in
the probabilistic theory of insurance ruin since the beginning of the XXth
Century [2,3].

The seminal paper of Montroll and Weiss has been followed by many studies 
focusing on anomalous relaxation and anomalous diffusion. This is the subject 
of two recent reviews by Metzler and Klafter [4,5]. 

The present author has recently reviewed the applications of CTRWs to
Finance and Economics [6]. These applications were triggered by a series of
papers on finance and fractional calculus [7,8,9], but the reader is referred to
ref. [10] for an early application of the normal compound Poisson process to
financial data.



Preprint submitted to Elsevier Science, 2 February 2008



The recent research of the present author has focused on the behaviour of
waiting times (also known as durations) between trades and orders in financial
markets [8,11,12,13]. It turned out that interorder and intertrade waiting times
are not exponentially distributed. Therefore, the jump process of tick-by-tick
prices is non-Markovian [8].

In an article within this issue [14], Bianco and Grigolini apply a new method 
to verify whether the intertrade waiting time process is a genuine renewal 
process [15,16,17]. This was assumed by the CTRW hypothesis in [7]. They 
find that intertrade waiting times follow a renewal process. 

Here, inspired by the work of Edelman and Gillespie [18,19], a
phenomenological model for intraday tick-by-tick financial data is presented.
It is able to reproduce some important stylized facts. The paper is organized
as follows. Section 2 contains an outline of the theory of CTRWs. Section 3
contains a description of the model as well as a discussion of results from
Monte Carlo simulations.



2 Outline of theory 

2.1 Basic definitions 

CTRWs are point processes with reward. The point process is characterized
by a sequence of independent identically distributed (i.i.d.) positive random
variables $\tau_i$, which can be interpreted as waiting times between two
consecutive events:

$$t_n = t_0 + \sum_{i=1}^{n} \tau_i; \quad t_n - t_{n-1} = \tau_n; \quad n = 1, 2, 3, \ldots; \quad t_0 = 0. \qquad (1)$$

The rewards are i.i.d. not necessarily positive random variables: $\xi_i$. In
the usual physical interpretation, they represent the jumps of a random walker,
and they can be n-dimensional vectors. In this paper, only the 1-dimensional
case is studied for a continuous random variable, but the extension of many
results to the n-dimensional case and to a lattice is straightforward. The
position $x$ of the walker at time $t$ is (with $N(t) = \max\{n : t_n \le t\}$
and $x(0) = 0$):

$$x(t) = \sum_{i=1}^{N(t)} \xi_i. \qquad (2)$$

CTRWs are rather good and general phenomenological models for diffusion,
including anomalous diffusion, provided that the time of residence of the walker
is much greater than the time it takes to make a jump. In fact, in this
formalism, jumps are instantaneous.
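
The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `simulate_ctrw` and `position_at`, the exponential waiting times and the standard normal jumps are choices made for this example, and the walker is taken to have already jumped at the event time $t_n$.

```python
import numpy as np

def simulate_ctrw(n_jumps, sample_wait, sample_jump, rng):
    """Return event times t_n and positions x(t_n), with t_0 = 0 and x(0) = 0."""
    waits = sample_wait(rng, n_jumps)   # tau_1, ..., tau_n: positive, i.i.d.
    jumps = sample_jump(rng, n_jumps)   # xi_1, ..., xi_n: i.i.d. rewards
    times = np.concatenate(([0.0], np.cumsum(waits)))
    positions = np.concatenate(([0.0], np.cumsum(jumps)))
    return times, positions

def position_at(t, times, positions):
    """x(t) = sum of the first N(t) jumps, with N(t) = max{n : t_n <= t}."""
    n_t = np.searchsorted(times, t, side="right") - 1
    return positions[n_t]

# Illustrative choice: exponential waits (mean 10) and standard normal jumps.
rng = np.random.default_rng(0)
times, positions = simulate_ctrw(
    1000,
    lambda r, n: r.exponential(scale=10.0, size=n),
    lambda r, n: r.normal(0.0, 1.0, size=n),
    rng,
)
```

Between two consecutive events the position returned by `position_at` stays constant, which reflects the fact that jumps are instantaneous in this formalism.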

The financial interpretation of the random variables is straightforward. If
trades take place in a continuous double auction, both price variations and
waiting times (also called durations) between two consecutive trades are
random variables. If $S(t)$ is the price of an asset at time $t$ defined
according to the previous tick interpolation procedure, $S(t) = S(t_i)$ where
$t_i$ is the time instant at which the last trade took place, then the price
process can be considered as a pure jump stochastic process in continuous time.
In finance, it is better to work with returns rather than prices. If $S(0)$ is
the price at time $t = 0$, then the variable $x(t) = \log(S(t)/S(0))$ is called
the log-return or, better, the log-price. This variable is analogous to the
position of the walker in the physical interpretation. In the financial
interpretation the jump random variables $\xi_i$ are tick-by-tick log-returns
and they coincide with the difference between two consecutive log-prices,
whereas the waiting times or durations $\tau_i$ denote the elapsed time between
two consecutive trades.
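
As a minimal sketch of the previous tick interpolation and of the log-price, one could proceed as follows. The function names and the four-trade data set are hypothetical and serve only to illustrate the definitions.

```python
import numpy as np

def previous_tick_price(t, trade_times, trade_prices):
    """S(t) = S(t_i), where t_i is the time of the last trade at or before t."""
    idx = np.searchsorted(trade_times, np.asarray(t), side="right") - 1
    if np.any(idx < 0):
        raise ValueError("t precedes the first trade")
    return trade_prices[idx]

def log_price(prices, s0):
    """x(t) = log(S(t) / S(0))."""
    return np.log(prices / s0)

# Hypothetical trades: times in seconds, prices in arbitrary units.
trade_times = np.array([0.0, 12.0, 15.5, 40.0])
trade_prices = np.array([100.0, 100.5, 100.2, 101.0])

grid = np.array([0.0, 10.0, 12.0, 30.0, 60.0])
s_grid = previous_tick_price(grid, trade_times, trade_prices)
x_grid = log_price(s_grid, trade_prices[0])

# Tick-by-tick log-returns: differences of consecutive log-prices.
tick_returns = np.diff(np.log(trade_prices))
```

The tick-by-tick log-returns sum to the log-price at the last trade, which is the financial counterpart of the position of the walker.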

In general, jumps and waiting times are not independent from each other. In
any case, a CTRW is characterized by the joint probability density
$\varphi(\xi, \tau)$ of jumps and waiting times; $\varphi(\xi, \tau)\, d\xi\, d\tau$
is the probability of a jump to be in the interval $(\xi, \xi + d\xi)$ and of a
waiting time to be in the interval $(\tau, \tau + d\tau)$. The following
integral equation gives the probability density, $p(x,t)$, for the walker being
in position $x$ at time $t$, conditioned on the fact that it was in position
$x = 0$ at time $t = 0$:

$$p(x,t) = \delta(x)\, \Psi(t) + \int_0^t \int_{-\infty}^{+\infty} \varphi(x - x', t - t')\, p(x', t')\, dt'\, dx', \qquad (3)$$

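where $\Psi(\tau)$ is the so-called survival function. $\Psi(\tau)$ is related
to the marginal waiting-time probability density $\psi(\tau)$. The two marginal
densities $\psi(\tau)$ and $\lambda(\xi)$ are:

$$\psi(\tau) = \int_{-\infty}^{+\infty} \varphi(\xi, \tau)\, d\xi, \qquad \lambda(\xi) = \int_0^{\infty} \varphi(\xi, \tau)\, d\tau, \qquad (4)$$

and the survival function $\Psi(\tau)$ is defined as:

$$\Psi(\tau) = 1 - \int_0^{\tau} \psi(\tau')\, d\tau' = \int_{\tau}^{\infty} \psi(\tau')\, d\tau'. \qquad (5)$$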

The integral equation, eq. (3), is linear and it can be solved in the
Laplace-Fourier domain. The Laplace transform, $\tilde{g}(s)$, of a
(generalized) function $g(t)$ is defined as:

$$\tilde{g}(s) = \int_0^{+\infty} dt\, e^{-st} g(t), \qquad (6)$$

whereas the Fourier transform of a (generalized) function $f(x)$ is defined as:

$$\hat{f}(\kappa) = \int_{-\infty}^{+\infty} dx\, e^{i \kappa x} f(x). \qquad (7)$$

A generalized function is a distribution (like Dirac's $\delta$) in the sense
of S. L. Sobolev and L. Schwartz [20].

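One gets:

$$\tilde{\hat{p}}(\kappa, s) = \tilde{\Psi}(s)\, \frac{1}{1 - \tilde{\hat{\varphi}}(\kappa, s)}, \qquad (8)$$

or, in terms of the density $\psi(\tau)$:

$$\tilde{\hat{p}}(\kappa, s) = \frac{1 - \tilde{\psi}(s)}{s}\, \frac{1}{1 - \tilde{\hat{\varphi}}(\kappa, s)}, \qquad (9)$$

as, from eq. (5), one has:

$$\tilde{\Psi}(s) = \frac{1 - \tilde{\psi}(s)}{s}. \qquad (10)$$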



In order to obtain $p(x,t)$, it is then necessary to invert its Laplace-Fourier
transform $\tilde{\hat{p}}(\kappa, s)$. Analytic solutions are quite important,
as they provide a benchmark for testing numerical inversion methods. In the
next section, an explicit analytic solution for a class of continuous-time
random walks with anomalous relaxation behaviour will be presented. It will be
necessary to restrict oneself to the uncoupled case, in which jumps and
waiting times are not correlated.



2.2 The normal compound Poisson process 



In this section, the solution of eq. (3) will be derived in the uncoupled case
where the joint probability density of jumps and durations can be factorized
in terms of its marginals. After the derivation of a general formula for
$p(x,t)$, this will be specialized to the case of the normal compound Poisson
process (NCPP).

If jump sizes do not depend on waiting times, the joint probability density for
jumps and waiting times can be written as follows:

$$\varphi(\xi, \tau) = \lambda(\xi)\, \psi(\tau) \qquad (11)$$

with the normalization conditions $\int d\xi\, \lambda(\xi) = 1$ and
$\int d\tau\, \psi(\tau) = 1$.
In this case the integral master equation for $p(x,t)$ becomes:

$$p(x,t) = \delta(x)\, \Psi(t) + \int_0^t \psi(t - t') \left[ \int_{-\infty}^{+\infty} \lambda(x - x')\, p(x', t')\, dx' \right] dt'. \qquad (12)$$



This equation has a well known general explicit solution in terms of $P(n,t)$,
the probability of $n$ jumps occurring up to time $t$, and of the n-fold
convolution of the jump density, $\lambda_n(x)$:

$$\lambda_n(x) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} d\xi_{n-1}\, d\xi_{n-2} \cdots d\xi_1\, \lambda(x - \xi_{n-1})\, \lambda(\xi_{n-1} - \xi_{n-2}) \cdots \lambda(\xi_1). \qquad (13)$$

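Indeed, $P(n,t)$ is given by:

$$P(n,t) = \int_0^t \psi_n(t - \tau)\, \Psi(\tau)\, d\tau \qquad (14)$$

where $\psi_n(\tau)$ is the n-fold convolution of the waiting-time density:

$$\psi_n(\tau) = \int_0^{\tau} \int_0^{\tau_{n-1}} \cdots \int_0^{\tau_2} d\tau_{n-1}\, d\tau_{n-2} \cdots d\tau_1\, \psi(\tau - \tau_{n-1})\, \psi(\tau_{n-1} - \tau_{n-2}) \cdots \psi(\tau_1). \qquad (15)$$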

The n-fold convolutions defined above are probability density functions for 
the sum of n variables. 

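The Laplace transform of $P(n,t)$, $\tilde{P}(n,s)$, reads:

$$\tilde{P}(n,s) = [\tilde{\psi}(s)]^n\, \tilde{\Psi}(s). \qquad (16)$$

By taking the Fourier-Laplace transform of eq. (12), one gets:

$$\tilde{\hat{p}}(\kappa, s) = \frac{1 - \tilde{\psi}(s)}{s}\, \frac{1}{1 - \tilde{\psi}(s)\, \hat{\lambda}(\kappa)}. \qquad (17)$$

But, recalling that $|\hat{\lambda}(\kappa)| < 1$ and $|\tilde{\psi}(s)| < 1$,
if $\kappa \neq 0$ and $s \neq 0$, eq. (17) becomes:

$$\tilde{\hat{p}}(\kappa, s) = \tilde{\Psi}(s) \sum_{n=0}^{\infty} [\tilde{\psi}(s)\, \hat{\lambda}(\kappa)]^n; \qquad (18)$$

this gives, inverting the Fourier and the Laplace transforms and taking into
account eqs. (13) and (14):

$$p(x,t) = \sum_{n=0}^{\infty} P(n,t)\, \lambda_n(x). \qquad (19)$$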

Eq. (19) can also be used as the starting point to derive eq. (12) via the 
transforms of Fourier and Laplace, as it describes a jump process subordinated 
to a renewal process. 
A remarkable analytic solution is available when the waiting-time probability
density function has the following exponential form:

$$\psi(\tau) = \mu\, e^{-\mu \tau}. \qquad (20)$$

Then, the survival probability is $\Psi(\tau) = e^{-\mu \tau}$ and the
probability of $n$ jumps occurring up to time $t$ is given by the Poisson
distribution:

$$P(n,t) = \frac{(\mu t)^n}{n!}\, e^{-\mu t}. \qquad (21)$$

This is the only Markovian case, and equation (19) becomes:

$$p(x,t) = \sum_{n=0}^{\infty} \frac{(\mu t)^n}{n!}\, e^{-\mu t}\, \lambda_n(x). \qquad (22)$$

If $\lambda(\xi)$ follows the normal distribution $N(\xi; \bar{\xi}, \sigma_\xi)$,
then the n-fold convolution is given by:
$\lambda_n(x) = N(x; n\bar{\xi}, \sqrt{n}\, \sigma_\xi)$.
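
The series for the NCPP can be evaluated numerically by truncation. The sketch below is only an illustration: the function name `ncpp_density`, the cut-off `n_max` and the parameter values are assumptions of this example. The $n = 0$ term is an atom at $x = 0$ with weight $e^{-\mu t}$ (no jump has occurred yet), so only the terms with $n \ge 1$ are summed.

```python
import math
import numpy as np

def ncpp_density(x, t, mu, xi_mean, xi_std, n_max=200):
    """Continuous part of p(x, t) for the NCPP: Poisson-weighted normal densities.

    The n = 0 term (an atom at x = 0 with weight exp(-mu * t)) is left out.
    """
    x = np.asarray(x, dtype=float)
    density = np.zeros_like(x)
    for n in range(1, n_max + 1):
        # log of the Poisson weight (mu t)^n / n! * exp(-mu t), for stability
        log_weight = n * math.log(mu * t) - math.lgamma(n + 1) - mu * t
        mean_n = n * xi_mean               # mean of the n-fold convolution
        std_n = math.sqrt(n) * xi_std      # standard deviation of the n-fold convolution
        gauss = np.exp(-0.5 * ((x - mean_n) / std_n) ** 2) / (std_n * math.sqrt(2.0 * math.pi))
        density += math.exp(log_weight) * gauss
    return density

# Illustrative parameters: activity 0.1 per second, observation time 100 s.
x = np.linspace(-0.05, 0.05, 20001)
p = ncpp_density(x, t=100.0, mu=0.1, xi_mean=0.0, xi_std=0.001)
mass = p.sum() * (x[1] - x[0])   # should be close to 1 - exp(-mu * t)
```

The missing probability mass, $e^{-\mu t}$, is the probability that no trade has taken place up to time $t$.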

Given a series of empirical tick-by-tick log-returns, $\{\xi_i\}_{i=1}^{N}$, as
well as durations, $\{\tau_i\}_{i=1}^{N}$, one can directly evaluate the three
parameters $\mu$: the activity of the Poisson process, $\bar{\xi}$: the average
of log-returns, and $\sigma_\xi$: the standard deviation of log-returns by
means of suitable estimators [10].
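
One natural choice of estimators, assumed here for illustration and not necessarily the one used in ref. [10], is the inverse of the mean duration for the activity, together with the sample mean and the sample standard deviation of the log-returns. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def estimate_ncpp_parameters(log_returns, durations):
    """Moment-type estimates of (mu, mean log-return, std of log-returns)."""
    mu_hat = 1.0 / np.mean(durations)          # activity: inverse mean duration
    xi_mean_hat = np.mean(log_returns)         # average tick-by-tick log-return
    xi_std_hat = np.std(log_returns, ddof=1)   # sample standard deviation
    return mu_hat, xi_mean_hat, xi_std_hat

# Synthetic NCPP data: mu = 0.05 per second, zero mean, sigma = 0.001.
rng = np.random.default_rng(1)
durations = rng.exponential(scale=20.0, size=50_000)
log_returns = rng.normal(0.0, 1e-3, size=50_000)
mu_hat, xi_mean_hat, xi_std_hat = estimate_ncpp_parameters(log_returns, durations)
```

On data that really follow an NCPP the three estimates converge to the true parameters; on empirical data the stylized facts listed below show up as systematic deviations from the fitted model.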

However, the normal compound Poisson process is not able to reproduce the 
following stylized facts on high frequency data: 

(1) The empirical distribution of log-returns is leptokurtic, whereas the NCPP 
assumes a mesokurtic (actually normal) distribution. 

(2) The empirical distribution of durations is non-exponential with excess 
standard deviation [21,22,8,11,12], whereas the NCPP assumes an expo- 
nential distribution. 

(3) The autocorrelation of absolute log-returns decays slowly [11], whereas 
the NCPP assumes i.i.d. log-returns. 

(4) Log-returns and waiting times are not independent [11,23], whereas the 
NCPP assumes their independence. 

(5) Volatility and activity vary during the trading day [24], whereas the 
NCPP assumes they are constant. 
3 Mixtures of normal compound Poisson processes

3.1 Definition

It is possible to overcome the above shortcomings by using a suitable mixture
of NCPPs. During a trading day, the volatility and the activity are higher at
the opening of the market, then they decrease at midday and they increase
again at market closure [24]. If the trading day can be divided into $T$
intervals of constant activity $\{\mu_i\}_{i=1}^{T}$, then the waiting-time
distribution is a mixture of exponential distributions and its probability
density can be written as:

$$\psi(\tau) = \sum_{i=1}^{T} a_i\, \mu_i\, e^{-\mu_i \tau}, \qquad (23)$$

where $\{a_i\}_{i=1}^{T}$ is a set of suitable weights. The activity seasonality
can be mimicked by values of $\mu_i$ that decrease towards midday and then
increase again towards market closure. In order to reproduce the correlation
between volatility and activity, one can assume that:

$$\sigma_{\xi,i} = c\, \mu_i, \qquad (24)$$

where $c$ is a suitable constant. Future work will be devoted to an analytical
study of this model as well as to further empirical investigations on model
validation. Below, the results of a simulation performed with the model are
presented, and the performance of the model with respect to the stylized facts
is discussed.
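
The excess standard deviation produced by a mixture of exponential waiting-time densities can be checked directly from its first two moments. The sketch below assumes equal weights $a_i = 1/10$ and uses the ten activities of the simulation in Section 3.2; the function names are hypothetical.

```python
import numpy as np

def mixture_pdf(tau, weights, rates):
    """psi(tau) = sum_i a_i * mu_i * exp(-mu_i * tau)."""
    tau = np.asarray(tau, dtype=float)[..., None]
    return np.sum(weights * rates * np.exp(-rates * tau), axis=-1)

def mixture_mean_std(weights, rates):
    """Mean and standard deviation of the mixture of exponentials."""
    mean = np.sum(weights / rates)                  # E[tau] = sum_i a_i / mu_i
    second = np.sum(2.0 * weights / rates ** 2)     # E[tau^2] = sum_i 2 a_i / mu_i^2
    return mean, np.sqrt(second - mean ** 2)

# Activities in 1/s; equal weights are an assumption of this sketch.
rates = 1.0 / np.array([10, 20, 30, 40, 50, 40, 30, 20, 15, 10], dtype=float)
weights = np.full(rates.size, 1.0 / rates.size)
mean, std = mixture_mean_std(weights, rates)
```

With these values the mean waiting time is 26.5 s and the standard deviation is about 32.3 s. For a single exponential density the two would coincide, so the mixture has a standard deviation in excess of its mean.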



3.2 Results 

A Monte Carlo simulation of the model described in the previous subsection
has been performed by considering a trading day divided into ten intervals of
constant activity with $\{\mu_i\}_{i=1}^{10}$ = 1/10, 1/20, 1/30, 1/40, 1/50,
1/40, 1/30, 1/20, 1/15, 1/10 s$^{-1}$. For each value of $\mu_i$, 100
exponentially distributed waiting times have been extracted as well as 100
normally distributed log-returns with zero average and
$\sigma_{\xi,i} = 0.001 \cdot \mu_i$. Therefore, there are 1000 values of
waiting times and log-returns in a trading day, representing a rather liquid
share. The opening price is set to 100 arbitrary units (a.u.). In Fig. 1, a
sample path is plotted for the price as a function of trading time. Fig. 2 and
Fig. 3 represent the tick-by-tick time series of log-returns and waiting times
respectively. For this particular simulation, the effect of variable activity
and volatility can be detected by direct eye inspection.
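
A simulation of this kind can be sketched as follows. This is not the original code: the random number generator, the seed and the names are choices of this example, and the volatility is assumed to be proportional to the activity, $\sigma_{\xi,i} = c\, \mu_i$ with $c = 0.001$.

```python
import numpy as np

MEAN_WAITS = np.array([10, 20, 30, 40, 50, 40, 30, 20, 15, 10], dtype=float)  # 1/mu_i, in s
TICKS_PER_INTERVAL = 100
C = 0.001            # assumed constant linking volatility to activity
OPENING_PRICE = 100.0

def simulate_trading_day(rng):
    """Simulate one trading day: 10 activity regimes with 100 ticks each."""
    rates = 1.0 / MEAN_WAITS
    # Exponential waiting times with rate mu_i within each interval.
    waits = np.concatenate(
        [rng.exponential(scale=1.0 / mu, size=TICKS_PER_INTERVAL) for mu in rates]
    )
    # Normal log-returns with zero mean and standard deviation C * mu_i.
    returns = np.concatenate(
        [rng.normal(0.0, C * mu, size=TICKS_PER_INTERVAL) for mu in rates]
    )
    times = np.cumsum(waits)                                # trade times, in s
    prices = OPENING_PRICE * np.exp(np.cumsum(returns))     # price after each trade
    return times, waits, returns, prices

rng = np.random.default_rng(42)
times, waits, returns, prices = simulate_trading_day(rng)
```

Plotting `prices` against `times`, and `np.abs(returns)` and `waits` against the tick index, gives pictures of the kind shown in Figs. 1 to 3, although the individual sample path depends on the seed.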



Fig. 1. Simulated price as a function of transaction time. The initial price is set to
100 arbitrary units (a.u.). Simulation times are measured in seconds.

Fig. 2. A simulated absolute log-return series.

Fig. 3. A simulated waiting time series.

In order to show that this model is able to reproduce the stylized facts
described above, another set of figures is presented in the following. In
Fig. 4 the empirical complementary cumulative distribution function is plotted
for absolute tick-by-tick log-returns. For comparison, the Gaussian fit with the
same standard deviation of the 1000 log-returns is given by a solid line. This
distribution has fat tails, is leptokurtic and the kurtosis is equal to 6.

Fig. 4. Empirical complementary cumulative distribution function for absolute log
returns (circles). The solid line is a Gaussian fit.

Fig. 5. Empirical survival probability.

The empirical complementary cumulative distribution function for intertrade 
durations is given in Fig. 5. The solid line is the single exponential fit to the 
simulated data. There is excess standard deviation: the standard deviation of 
waiting times is 29 s, whereas the average waiting time is 25 s. 
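
The leptokurtic character of the mixture can be checked with a short sketch. It assumes, as above, normal log-returns with $\sigma_{\xi,i} = 0.001 \cdot \mu_i$ and equally populated intervals, and it uses a much larger sample than the 1000 ticks of the paper in order to reduce sampling noise; all names are hypothetical.

```python
import numpy as np

def sample_kurtosis(x):
    """Fourth central moment divided by the squared variance (3 for a normal)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return np.mean(d ** 4) / np.mean(d ** 2) ** 2

def empirical_ccdf(x):
    """Sorted sample values and the fraction of the sample exceeding each of them."""
    xs = np.sort(np.asarray(x, dtype=float))
    ccdf = 1.0 - np.arange(1, xs.size + 1) / xs.size
    return xs, ccdf

rates = 1.0 / np.array([10, 20, 30, 40, 50, 40, 30, 20, 15, 10], dtype=float)
c = 0.001  # assumed constant linking volatility to activity

# Kurtosis of an equal-weight mixture of zero-mean normals with std c * mu_i.
theoretical_kurtosis = 3.0 * np.mean(rates ** 4) / np.mean(rates ** 2) ** 2

rng = np.random.default_rng(7)
n_per_interval = 100_000
returns = np.concatenate(
    [rng.normal(0.0, c * mu, size=n_per_interval) for mu in rates]
)
kurt = sample_kurtosis(returns)
xs, ccdf = empirical_ccdf(np.abs(returns))
```

Under these assumptions the theoretical kurtosis is about 6.4, which is in line with the value of 6 found for the simulated sample of 1000 log-returns.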

Fig. 6 shows the slow decay of the autocorrelation of absolute log-returns
related to volatility clustering, whereas the autocorrelation of signed
log-returns is already zero at the second lag.
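
An autocorrelation estimate with the corresponding zero level can be sketched as follows. The standard biased sample estimator is assumed here, since the estimator used for Fig. 6 is not specified, and the simulated data follow the same assumptions as above ($\sigma_{\xi,i} = 0.001 \cdot \mu_i$); the names are hypothetical.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Biased sample autocorrelation for lags 0, 1, ..., max_lag."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denom = np.dot(d, d)
    return np.array(
        [np.dot(d[: d.size - k], d[k:]) / denom for k in range(max_lag + 1)]
    )

def zero_band(n_samples):
    """Statistical zero level, 3 / sqrt(n), for n uncorrelated observations."""
    return 3.0 / np.sqrt(n_samples)

rates = 1.0 / np.array([10, 20, 30, 40, 50, 40, 30, 20, 15, 10], dtype=float)
rng = np.random.default_rng(3)
returns = np.concatenate([rng.normal(0.0, 0.001 * mu, size=100) for mu in rates])

acf_abs = autocorrelation(np.abs(returns), 20)    # absolute log-returns
acf_signed = autocorrelation(returns, 20)         # signed log-returns
band = zero_band(returns.size)                    # 3 / sqrt(1000)
```

Values of `acf_abs` that stay above `band` over many lags signal volatility clustering, whereas values of `acf_signed` inside the band are compatible with uncorrelated signed log-returns.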

Fig. 6. Estimate of the autocorrelation function for absolute log-returns (.-), and
signed log-returns (-). The solid horizontal lines represent the statistical zero level
($\pm 3/\sqrt{1000}$).

In conclusion, the model based on mixtures of normal compound Poisson
processes incorporates variable daily activity, as well as the dependence
between durations and tick-by-tick log-returns via eq. (24). It is then able to
replicate the following stylized facts:

• The empirical distribution of log-returns is leptokurtic;

• The empirical distribution of durations is non-exponential with excess
standard deviation;

• The autocorrelation of absolute log-returns decays slowly.

Work is currently in progress to empirically validate the model [25].